Learning to Hash on Structured Data

نویسندگان

  • Qifan Wang
  • Luo Si
  • Bin Shen
چکیده

Hashing techniques have been widely applied for large scale similarity search problems due to the computational and memory efficiency. However, most existing hashing methods assume data examples are independently and identically distributed. But there often exists various additional dependency/structure information between data examples in many real world applications. Ignoring this structure information may limit the performance of existing hashing algorithms. This paper explores the research problem of learning to Hash on Structured Data (HSD) and formulates a novel framework that considers additional structure information. In particular, the hashing function is learned in a unified learning framework by simultaneously ensuring the structural consistency and preserving the similarities between data examples. An iterative gradient descent algorithm is designed as the optimization procedure. Furthermore, we improve the effectiveness of hashing function through orthogonal transformation by minimizing the quantization error. Experimental results on two datasets clearly demonstrate the advantages of the proposed method over several state-of-the-art hashing methods.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Minimal Loss Hashing for Compact Binary Codes

We propose a method for learning similaritypreserving hash functions that map highdimensional data onto binary codes. The formulation is based on structured prediction with latent variables and a hinge-like loss function. It is efficient to train for large datasets, scales well to large code lengths, and outperforms state-of-the-art methods.

متن کامل

Online Hashing

Although hash function learning algorithms have achieved great success in recent years, most existing hash models are off-line, which are not suitable for processing sequential or online data. To address this problem, this paper proposes an online hash model to accommodate data coming in stream for online learning. Specifically, a new loss function is proposed to measure the similarity loss bet...

متن کامل

An Improved Hash Function Based on the Tillich-Zémor Hash Function

Using the idea behind the Tillich-Zémor hash function, we propose a new hash function. Our hash function is parallelizable and its collision resistance is implied by a hardness assumption on a mathematical problem. Also, it is secure against the known attacks. It is the most secure variant of the Tillich-Zémor hash function until now.

متن کامل

Hash Kernels for Structured Data

We propose hashing to facilitate efficient kernels. This generalizes previous work using sampling and we show a principled way to compute the kernel matrix for data streams and sparse feature spaces. Moreover, we give deviation bounds from the exact kernel matrix. This has applications to estimation on strings and graphs.

متن کامل

A New Ring-Based SPHF and PAKE Protocol On Ideal Lattices

emph{ Smooth Projective Hash Functions } ( SPHFs ) as a specific pattern of zero knowledge proof system are fundamental tools to build many efficient cryptographic schemes and protocols. As an application of SPHFs, emph { Password - Based Authenticated Key Exchange } ( PAKE ) protocol is well-studied area in the last few years. In 2009, Katz and Vaikuntanathan described the first lattice-based ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015